Massive genomic data processing and deep analysis
Similar resources
Massive Genomic Data Processing and Deep Analysis
Today large sequencing centers are producing genomic data at the rate of 10 terabytes a day and require complicated processing to transform massive amounts of noisy raw data into biological information. To address these needs, we develop a system for end-to-end processing of genomic data, including alignment of short read sequences, variation discovery, and deep analysis. We also employ a range...
Analysis of Pre-processing and Post-processing Methods and Using Data Mining to Diagnose Heart Diseases
Today, a great deal of data is generated in the medical field. Acquiring useful knowledge from this raw data requires data processing and detection of meaningful patterns and this objective can be achieved through data mining. Using data mining to diagnose and prognose heart diseases has become one of the areas of interest for researchers in recent years. In this study, the literature on the ap...
Data Transmission and Massive Data Analysis Challenges
The proposed initiative is a multi-disciplinary effort involving the colleges of engineering and basic sciences. It has a dual long term goal: (a) to develop and implement telemetry suited for challenging environments capable of high throughput, and (b) to develop data analysis and visualization tools on parallel architectures capable of efficiently interfacing with experiments and processing m...
Privacy-Preserving Processing of Raw Genomic Data
Geneticists prefer to store patients’ aligned, raw genomic data, in addition to their variant calls (compact and summarized form of the raw data), mainly because of the immaturity of bioinformatic algorithms and sequencing platforms. Thus, we propose a privacy-preserving system to protect the privacy of aligned, raw genomic data. The raw genomic data of a patient includes millions of short read...
Parallel Processing for Scanning Genomic Data-Bases
Scanning a genomic database aims to detect similarities between DNA or protein sequences. This is a time-consuming operation, especially when searching for weak similarities. The scan can be sped up using various parallelization strategies. This paper presents two approaches carried out at IRISA: systolic and distributed parallelization.
Journal
Journal title: Proceedings of the VLDB Endowment
Year: 2012
ISSN: 2150-8097
DOI: 10.14778/2367502.2367534